ABSTRACT 

We measure the three-dimensional topology of large-scale structure in the Sloan Digital Sky Survey 
(SDSS). This allows the genus statistic to be measured with unprecedented statistical accuracy. The 
sample size is now sufficiently large to allow the topology to be an important tool for testing galaxy 
formation models. For comparison, we make mock SDSS samples using several state-of-the-art N-body 
simulations: the Millennium run of Springel et al. (2005) (10 billion particles), Kim & Park (2006) 
CDM models (1.1 billion particles), and Cen & Ostriker (2006) hydrodynamic code models (8.6 billion 
cell hydro mesh). Each of these simulations uses a different method for modeling galaxy formation. 
The SDSS data show a genus curve that is broadly characteristic of that produced by Gaussian 
random phase initial conditions. Thus the data strongly support the standard model of inflation 
where Gaussian random phase initial conditions are produced by random quantum fluctuations in 
the early universe. But on top of this general shape there are measurable differences produced by 
non-linear gravitational effects (cf. Matsubara 1994), and biasing connected with galaxy formation. 
The N-body simulations have been tuned to reproduce the power spectrum and multiplicity function 
but not topology, so topology is an acid test for these models. The data show a "meatball" shift (only 
partly due to the Sloan Great Wall of Galaxies; this shift also appears in a sub-sample not containing 
the Wall) which differs at the $2.5\sigma$ level from the results of the Millennium run and the Kim & Park 
dark halo models, even including the effects of cosmic variance. 

Subject headings: cosmology:observations — large-scale structure of universe — methods: numerical 



1. INTRODUCTION 

The topology of large scale structure in the universe 
is an important physical property of the matter density 
field that can be compared with the prediction of the sim- 
ple inflationary models (Guth 1981; Linde 1983) where 
Gaussian random phase initial conditions are generated 
from quantum fluctuations in the early universe. An- 
alytic tools for quantitatively analyzing the topology of 
large scale structure in three dimensions have been devel- 
oped during the past 20 years (Gott, Melott, & Dickinson 
1986; Hamilton, Gott, & Weinberg 1986; Gott, Weinberg, 
& Melott 1987; Gott et al. 1989; Vogeley et al. 
1994; Park, Kim, & Gott 2005; Park et al. 2005). The 
distribution of galaxies in space is smoothed to construct 
isodensity contour surfaces whose topology may be com- 
puted. Our genus statistic — described below — quantifies 
the topology of isodensity contours. 

On smoothing scales larger than the correlation length, 
the fluctuations are still in the linear regime and since 
fluctuations in the linear regime grow in place without 
changing topology, the topology we measure now should 
reflect that of the initial conditions, which should be of 
Gaussian random phase according to the theory of inflation 
(Gott, Weinberg, & Melott 1987). We have shown 
this in detail by comparison with large N-body simulations 
(Gott, Weinberg, & Melott 1987). We expect 
sponge-like topology at the median density contour to 
be a strong prediction of inflation (Gott, Melott, and 
Dickinson 1986, Gott, Weinberg, Melott 1987). Previ- 
ous models of galaxy clustering suggested either a meat- 
ball topology — isolated clusters growing in a low den- 
sity connected background as suggested by the Press 
and Schechter (1974) formalism or by hierarchical galaxy 
formation (Peebles 1974, Soneira & Peebles 1978) — or a 
Swiss cheese topology — isolated voids surrounded on all 
sides by walls as suggested by Einasto, Joeveer, & Saar 
(1980). But with Gaussian random phase initial conditions 
we expect a sponge-like topology as pointed out by 
Gott, Melott, & Dickinson (1986). 

Studies of many observational samples have been con- 
ducted by our group and others, which have shown in 
every case a sponge-like median density contour as ex- 
pected from inflation. For notable examples, see Gott, 
Melott, & Dickinson (1986), Gott et al. (1989), Moore et 
al. (1992), Vogeley et al. (1994), Canavezes et al. (1998), 
Hikage et al. (2002), Hikage et al. (2003), and Park et 
al. (2005). In addition, in all cases the observed genus 
curve was reasonably well-fit by the Gaussian random 
phase theoretical curve in equation 4. Perhaps the most 
spectacular such agreement was seen in the Canavezes 
et al. (1998) analysis of the 15,000 galaxy PSCz redshift 
survey. This study showed quite a good fit (within the 
noise) to the random phase curve at a variety of smooth- 
ing lengths. The IRAS galaxies in this sample are pri- 
marily low mass spiral and irregular galaxies and so may 
suffer less biasing effects than galaxies from an optically 
selected sample. See Park et al. (2006) for discussion of 
the strong dependence of morphological fraction on den- 
sity in the SDSS. 

A two-dimensional variant of the genus statistic can 
also be applied to redshift slices, sky maps, and CMB 
maps (Coles 1988, Melott et al. 1989, Gott et al. 1990, 
Park et al. 1998). In this case, $G(\nu)$ = # of hot (or high 
density) spots minus # of cold (or low density) spots, and 
the Gaussian random phase hypothesis implies 
$g(\nu) \propto \nu \exp(-\nu^2/2)$. All of these studies (Redshift Slices: Park 
et al. 1992; Colley 1997; Hoyle, Vogeley, & Gott 2002; 
Hoyle et al. 2002; Sky Maps: Gott, Mao, Park, & Lahav 
1992; Park, Gott, & Choi 2001; CMB Maps: Smoot et 
al. 1994; Colley, Gott, & Park 1996; Kogut et al. 1996; 
Colley & Gott 2003; Park 2003; Spergel et al. 2006; 
Gott et al. 2006) indicate consistency with the Gaussian 
random phase hypothesis. The CMB maps are a partic- 
ularly powerful test of the Gaussian random phase hy- 
pothesis because the fluctuations are still firmly in the 
linear regime. The dramatic agreement between the 2D 
CMB results and the Gaussian random phase hypothesis 
strongly supports the standard theory of inflation 
and the idea that the initial conditions were truly Gaussian 
random phase. 

The three-dimensional topology data on galaxy clus- 
tering are particularly interesting because they allow us 
not only to confirm their general Gaussian random phase 
nature (on large scales), but also to test for non-linear 
processes and bias involved with galaxy formation. Mat- 
subara (1994) has discussed how the genus curve may 
be altered by second-order non-linear gravitational clus- 
tering effects, which can show up at small smoothing 
lengths. Vogeley et al. (1994) explicitly measured the 
diminution of the genus amplitude caused by these ef- 
fects. Park, Kim, & Gott (2005) have studied other im- 
portant alterations which can occur by non-linear gravi- 
tational evolution, redshift space distortion, and biasing 
associated with galaxy formation. 

In this paper we compute genus curves for volume- 
limited samples of the largest galaxy redshift survey to 
date, the Data Release 5 (DR5) of the SDSS, and for 
mock samples from state-of-the-art N-body simulations. 
Our goal is to examine whether the models for galaxy 
formation represented by these simulations are consistent 
with the observations. This is a potentially powerful 
test, because the input parameters of the flat $\Lambda$CDM 
model used in these simulations were determined by fitting 
to a host of other observations (CMB anisotropy 
data, large-scale power spectrum and correlation function 
of galaxies, SNe Ia luminosity-distance data, cluster 
abundances and baryon fraction, etc.) but not topology 
of large-scale structure in the galaxy distribution. As the 
basic underpinnings of the model become more secure, 
we can turn to more precise testing of models for the 
physics of galaxy formation. Likewise, the methods and 
parameters for simulating galaxies have been tuned to 
match other observations, but not topology. Thus, this 
comparison provides an independent test of the model 
for structure formation. 

2. THE GENUS AND RELATED STATISTICS 

The genus is a measure of the topology of the large 
scale distribution of galaxies. We first smooth the point 
distribution of galaxy positions (we use only volume- 
limited samples in the analysis below) with a Gaussian 
smoothing ball of radius $\lambda$, 

$$W(r) = \frac{1}{(2\pi)^{3/2}\lambda^3}\, e^{-r^2/2\lambda^2}, \qquad (1)$$

where $\lambda$ is chosen to be greater than or equal to the correlation 
length. In this paper we choose $\lambda = 6\,h^{-1}$ Mpc 
which is approximately equal to the galaxy correlation 
length. This smallest scale yields the highest resolution 
measure of the three-dimensional topology and the great- 
est statistical power because of the large number of reso- 
lution elements. This scale also gives the greatest amount 
of information about non-linear gravitational effects and 
biasing involved in galaxy formation. 
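
As a quick illustration of the smoothing step, the sketch below evaluates a normalized three-dimensional Gaussian kernel and checks numerically that it integrates to unity over all space. It assumes the kernel has the standard form with normalization $(2\pi)^{3/2}\lambda^3$ and exponent $-r^2/2\lambda^2$; the function names `gaussian_kernel` and `kernel_volume_integral` are hypothetical and are not taken from the CONTOUR 3D code.

```python
import math

def gaussian_kernel(r, lam):
    """Normalized 3D Gaussian smoothing kernel W(r) with smoothing length lam."""
    norm = (2.0 * math.pi) ** 1.5 * lam ** 3
    return math.exp(-r * r / (2.0 * lam * lam)) / norm

def kernel_volume_integral(lam, r_max_in_lam=10.0, steps=20000):
    """Integrate 4*pi*r^2*W(r) from 0 to r_max with the trapezoidal rule."""
    r_max = r_max_in_lam * lam
    dr = r_max / steps
    total = 0.0
    for i in range(steps + 1):
        r = i * dr
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * 4.0 * math.pi * r * r * gaussian_kernel(r, lam)
    return total * dr

# Smoothing length in h^-1 Mpc, the value adopted in the text.
SMOOTHING_LENGTH = 6.0
print(kernel_volume_integral(SMOOTHING_LENGTH))  # expected to be very close to 1
```

Because the kernel is normalized, smoothing redistributes galaxy counts without changing the total number in the sample volume.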

We establish density contour surfaces labeled by $\nu$, 
where the volume fraction on the high density side of 
the density contour surface is $f$: 

$$f = \frac{1}{\sqrt{2\pi}} \int_{\nu}^{\infty} e^{-x^2/2}\, dx. \qquad (2)$$
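
The mapping between the threshold label and the volume fraction can be sketched in a few lines. The example below assumes the Gaussian tail integral written in terms of the complementary error function, $f = \frac{1}{2}\,\mathrm{erfc}(\nu/\sqrt{2})$, and inverts it by bisection; the helper names are hypothetical.

```python
import math

def volume_fraction(nu):
    """High-density volume fraction f for threshold label nu (Gaussian tail)."""
    return 0.5 * math.erfc(nu / math.sqrt(2.0))

def threshold_from_fraction(f, lo=-8.0, hi=8.0, tol=1e-12):
    """Invert volume_fraction by bisection; f must lie strictly between 0 and 1."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if volume_fraction(mid) > f:
            lo = mid  # fraction still too large, so the threshold must be higher
        else:
            hi = mid
    return 0.5 * (lo + hi)

for nu in (-1.5, 0.0, 1.5):
    print(nu, round(volume_fraction(nu), 4))
```

In practice the inverse direction is the one that matters: a contour is chosen to enclose a given volume fraction, and the label follows from that fraction rather than from the density value itself.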



The genus as a function of $\nu$ is given by 

$$G(\nu) = \#\ \mathrm{of\ donut\ holes} - \#\ \mathrm{of\ isolated\ regions} \qquad (3)$$

(Gott, Melott, & Dickinson 1986). Thus, an isolated 
cluster has a genus of $-1$ by this definition. We have 
shown that $G(\nu)$ is also equal to minus the integral of the 
Gaussian curvature over the area of the contour surface 
divided by $4\pi$, which enables us to measure the genus 
with a computer program (CONTOUR 3D) (see Gott, 
Melott, & Dickinson 1986 and Gott, Weinberg, & Melott 
1987). 

For a Gaussian random phase density field, the genus 
per unit volume, $g(\nu) = G(\nu)/V$, is given by 

$$g(\nu) = A\,(1 - \nu^2)\, e^{-\nu^2/2}, \qquad (4)$$

where the amplitude $A = (\langle k^2 \rangle/3)^{3/2}/(2\pi^2)$ depends 
only on the average value of $k^2$ integrated over the 
smoothed power spectrum (Hamilton, Gott, & Weinberg 
1986; Adler 1981; Doroshkevich 1970; Gott, Weinberg, 
& Melott 1987). Thus, the amplitude $A$ [units: 
genus/$(h^{-1}\,\mathrm{Mpc})^3$] can tell us about the primordial power 
spectrum. For a Gaussian random field, the median density 
contour ($\nu = 0$, $f = 50\%$ volume enclosed) exhibits 
a sponge-like topology (many holes and no isolated regions); 
the $f = 7\%$ high density contour ($\nu = 1.5$) shows 
isolated clusters, while the $f = 93\%$ density contour 
($\nu = -1.5$) is dominated by isolated voids. We call the 
curves $G(\nu)$ and $g(\nu)$ (which differ only by a constant 
factor for a given sample) the "genus curves". 
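
The shape of the random phase genus curve is easy to tabulate. The sketch below evaluates $A(1-\nu^2)e^{-\nu^2/2}$ with an arbitrary amplitude (an assumption made only for illustration) and shows the sign pattern described above: positive at the median contour, zero at $\nu = \pm 1$, and negative at $\nu = \pm 1.5$. The function name is hypothetical.

```python
import math

def genus_random_phase(nu, amplitude=1.0):
    """Genus per unit volume for a Gaussian random phase field."""
    return amplitude * (1.0 - nu * nu) * math.exp(-0.5 * nu * nu)

# Tabulate the curve; the amplitude of 1 is arbitrary and sets only the scale.
for nu in (-1.5, -1.0, 0.0, 1.0, 1.5):
    print(f"nu = {nu:+.1f}   g = {genus_random_phase(nu):+.4f}")

# Differentiating the curve shows its most negative values occur at nu = +/- sqrt(3).
print(genus_random_phase(math.sqrt(3.0)))
```

The curve is symmetric in $\nu$, which is why horizontal shifts and asymmetries between the void and cluster sides are useful diagnostics of departures from the random phase case.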

For the purpose of examining departures of the ob- 
served genus curve from the Gaussian random phase pre- 
diction, we parameterize the genus curve by several de- 
rived quantities. First is the best-fit amplitude, 



$$A = \mathrm{amplitude\ of\ the\ genus\ curve}, \qquad (5)$$

which we measure by least squares fit of the theoretical 
random phase curve to the data, fitting only in the range 
$-1 < \nu < 1$. For the random phase case, this amplitude 
is proportional to $\langle k^2 \rangle^{3/2}$ of the smoothed power 
spectrum and so gives information about the primordial 
power spectrum. For observations, this amplitude appears 
lower because of non-linear clustering and biasing 
due to coalescence of structures (Park & Gott 1991b; 
Vogeley et al. 1994; Canavezes et al. 1998). 
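
A one-parameter least squares fit of this kind has a closed form. The sketch below assumes an unweighted fit of $A\,(1-\nu^2)e^{-\nu^2/2}$ to sampled genus values restricted to $-1 < \nu < 1$; the actual weighting and sampling used for the measurements are not specified here, and the names `genus_shape` and `fit_amplitude` are hypothetical.

```python
import math

def genus_shape(nu):
    """Random phase genus curve with unit amplitude."""
    return (1.0 - nu * nu) * math.exp(-0.5 * nu * nu)

def fit_amplitude(nus, genus_values, nu_min=-1.0, nu_max=1.0):
    """Unweighted least squares amplitude using only points with nu_min < nu < nu_max.

    Minimizing sum (g_i - A t_i)^2 over A gives A = sum(g_i t_i) / sum(t_i^2).
    """
    numerator = 0.0
    denominator = 0.0
    for nu, g in zip(nus, genus_values):
        if nu_min < nu < nu_max:
            t = genus_shape(nu)
            numerator += g * t
            denominator += t * t
    return numerator / denominator

# Synthetic example: a curve with known amplitude, distorted outside the fit range.
nus = [-3.0 + 0.1 * i for i in range(61)]
true_amplitude = 2.5
data = [true_amplitude * genus_shape(nu) * (1.0 if abs(nu) < 1.0 else 0.7)
        for nu in nus]
print(fit_amplitude(nus, data))  # recovers the input amplitude
```

Restricting the fit to the central part of the curve keeps the amplitude estimate insensitive to distortions on the void and cluster sides, which are then measured separately.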

We quantify shifts and deviations of the genus curve 
from the shape of the random phase curve using the fol- 
lowing three variables. We measure horizontal shifts of 
the central part of the genus curve with 

$$\Delta\nu = \frac{\int_{-1}^{1} g(\nu)\,\nu\, d\nu}{\int_{-1}^{1} g_{\rm rf}(\nu)\, d\nu}, \qquad (6)$$

where $g_{\rm rf}(\nu)$ is the genus of the random phase curve following 
the formula in equation 4 using the fitted amplitude 
$A$ above. The theoretical curve (Eq. 4) has $\Delta\nu = 0$. 
A negative value of $\Delta\nu$ is called a "meatball shift" as it 
is caused by a greater prominence of isolated connected 
high-density structures which push the genus curve to 
the left. A positive value of $\Delta\nu$ is called a "bubble shift" 
as it can be caused by a greater prominence of isolated 
voids, and might be produced by isolated explosions (Ostriker 
and Cowie 1981) as opposed to inflation. A slight, 
statistically significant "meatball shift" ($\Delta\nu < 0$) was 
observed first by Gott et al. (1989), who examined the 
CfA, Giovanelli & Haynes, and Tully datasets. In hindsight, 
one can see a slight "meatball shift" in the very 
first genus curve ever measured (Gott, Weinberg, & Melott 
1987) and this "meatball shift" was also seen for brighter 
galaxies in an analysis of an earlier sample of the SDSS 
(Park et al. 2005). This shift is presumably due to non-linear 
galaxy clustering and bias associated with galaxy 
formation. 
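
To make the sign convention concrete, the sketch below computes a shift statistic for a toy curve. It assumes the shift is the first moment of the observed curve over $-1 \le \nu \le 1$ divided by the integral of the best-fit random phase curve over the same range, and it uses a simple trapezoidal rule; the toy "observed" curve is just the random phase curve displaced to the left by 0.1, and all names are hypothetical.

```python
import math

def genus_random_phase(nu, amplitude=1.0):
    """Random phase genus curve A (1 - nu^2) exp(-nu^2 / 2)."""
    return amplitude * (1.0 - nu * nu) * math.exp(-0.5 * nu * nu)

def trapezoid(func, a, b, steps=2000):
    """Composite trapezoidal rule for func on [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (func(a) + func(b))
    for i in range(1, steps):
        total += func(a + i * h)
    return total * h

def delta_nu(observed, amplitude):
    """Shift statistic: first moment of the observed curve over the fitted area."""
    numerator = trapezoid(lambda nu: nu * observed(nu), -1.0, 1.0)
    denominator = trapezoid(lambda nu: genus_random_phase(nu, amplitude), -1.0, 1.0)
    return numerator / denominator

def shifted_left(nu):
    """Toy observed curve: the random phase curve moved 0.1 to the left."""
    return genus_random_phase(nu + 0.1)

print(delta_nu(genus_random_phase, 1.0))  # essentially zero for the unshifted curve
print(delta_nu(shifted_left, 1.0))        # negative, i.e. a "meatball shift"
```

For small displacements the statistic returns approximately the displacement itself, so a curve pushed to the left by 0.1 yields a value close to $-0.1$.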

To quantify departures of the observed genus from the 
random phase prediction in the region of the genus curve 
where isolated voids should dominate, we measure 

$$A_V = \frac{\int_{-2.2}^{-1.2} g(\nu)\, d\nu}{\int_{-2.2}^{-1.2} g_{\rm rf}(\nu)\, d\nu}, \qquad (7)$$

where $g_{\rm rf}(\nu)$ is again the genus of the best fit random 
phase curve following the formula in equation 4 (see 
Park, Kim, & Gott 2005 for an explanation of the choice 
of range in $\nu$). As shown in Park, Kim, & Gott (2005), 
a value of $A_V < 1$ can be the result of biasing in galaxy 
formation because voids are very empty and can coalesce 
into a few larger voids. $A_V$ is sensitive to the number of 
isolated voids but the density contour (at $\nu = -1.7$ for 
example) is given by the volume fraction, so if $A_V$ is less 
than 1, and by definition there is the same volume in the 
low density regions being measured, there must therefore 
be fewer but larger voids. Non-linear clustering alone at 
these scales predicts a value of $A_V > 1$ for the power 
spectrum of the $\Lambda$CDM model we adopt (see Figure 1 
of Park, Kim, & Gott 2005), so observing $A_V < 1$ may 
be an indication of bias in galaxy formation. 

Similar to $A_V$, we measure a quantity $A_C$ that characterizes 
departure from random phase behavior in the 
part of the genus curve expected to be sensitive to the 
number of isolated high-density regions (clusters), 

$$A_C = \frac{\int_{1.2}^{2.2} g(\nu)\, d\nu}{\int_{1.2}^{2.2} g_{\rm rf}(\nu)\, d\nu}. \qquad (8)$$

A value of $A_C < 1$ may occur because of non-linear clustering, 
when clusters collide and merge. If there is 
a single large connected structure like the Sloan Great 
Wall, this can also lower the value of $A_C$. Also, as Park, 
Kim, & Gott (2005) have shown, the Matsubara (1994) 
formula for second-order gravitational non-linear effects 
has the result that $A_V + A_C = 2$ at all scales, so if we 
observe both $A_V$ and $A_C$ to be less than 1 (as we find 
below to be the case), biased galaxy formation must be 
involved. 
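
The two abundance ratios can be sketched with the same numerical integration. The example below assumes integration ranges of $-2.2 \le \nu \le -1.2$ for the void side and $1.2 \le \nu \le 2.2$ for the cluster side, and it uses a deliberately artificial "observed" curve whose void side is scaled by 0.8 and cluster side by 0.9, so that both ratios fall below 1. All names and the toy curve are hypothetical.

```python
import math

def genus_random_phase(nu, amplitude=1.0):
    """Random phase genus curve A (1 - nu^2) exp(-nu^2 / 2)."""
    return amplitude * (1.0 - nu * nu) * math.exp(-0.5 * nu * nu)

def trapezoid(func, a, b, steps=2000):
    """Composite trapezoidal rule for func on [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (func(a) + func(b))
    for i in range(1, steps):
        total += func(a + i * h)
    return total * h

def area_ratio(observed, amplitude, nu_lo, nu_hi):
    """Ratio of observed to best-fit random phase genus integrated over a range."""
    fitted = lambda nu: genus_random_phase(nu, amplitude)
    return trapezoid(observed, nu_lo, nu_hi) / trapezoid(fitted, nu_lo, nu_hi)

def void_abundance(observed, amplitude):
    """A_V: ratio over the void-dominated part of the curve."""
    return area_ratio(observed, amplitude, -2.2, -1.2)

def cluster_abundance(observed, amplitude):
    """A_C: ratio over the cluster-dominated part of the curve."""
    return area_ratio(observed, amplitude, 1.2, 2.2)

def toy_observed(nu):
    """Artificial curve: 20% fewer voids and 10% fewer clusters than random phase."""
    scale = 0.8 if nu < 0.0 else 0.9
    return scale * genus_random_phase(nu)

a_v = void_abundance(toy_observed, 1.0)
a_c = cluster_abundance(toy_observed, 1.0)
print(a_v, a_c, a_v + a_c)
```

In this toy case the sum of the two ratios is below 2, which is the signature the text associates with biased galaxy formation rather than second-order gravitational evolution alone.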



3. N-BODY SIMULATIONS OF LARGE-SCALE STRUCTURE 

Before we confront results of current simulations of the 
flat $\Lambda$CDM model with the best observations of the topology 
of the galaxy distribution currently available, it is 
instructive to consider the remarkable success to date of 
large N-body simulations in modeling large-scale structure. 
It is encouraging that, as the volume and resolution 
in N-body simulations have grown with the size and 
quality of observational data sets, the agreement has 
become even more spectacular, perhaps a sign that we 
are on the right track with these models. 

Peebles did the first large N-body simulation for cosmology 
using 1000 dark matter particles with $\Omega_m = 1$ 
and Poisson initial conditions. It showed clusters like the 
Coma cluster forming from random fluctuations by grav- 
itational instability and a reasonable covariance func- 
tion. Aarseth, Gott and Turner (1973) used 4,000 par- 
ticles with initial conditions that had more power on 
large scales than Poisson (index $n = -1$). They found 
power law covariance functions quite like those observed 
even for models with $\Omega_m < 1$ and $n = -1$ (Gott, 
Turner, & Aarseth 1979, Gott & Turner 1979) as origi- 
nally proposed theoretically by Gott & Rees (1975). (In- 
deed, inflationary flat lambda models popular today have 
$\Omega_m < 1$ and more power on large scales than Poisson 
initial conditions, just as these early simulations sug- 
gested.) They also found voids as large as those ob- 
served. The largest voids had volumes such that at the 
mean density they would have contained as much mass 
as the Coma type clusters contained. This was reason- 
able from theoretical considerations of non-linear clus- 
tering, considering cluster (Gunn and Gott 1972) and 
void (Bertschinger 1985, Fillmore & Goldreich 1984) for- 
mation from small fluctuations via gravitational insta- 
bility. In Gaussian random phase initial conditions, iso- 
lated over- and under-dense regions in the initial con- 
ditions should be equal in mass leading to equal mass 
great clusters and empty voids lacking the same amount 
of mass. They also found that such $\Omega_m < 1$ models with 
more power at large scales than Poisson produced better 
multiplicity functions than $\Omega_m = 1$ Poisson models 
(Gott & Turner 1977; Bhavsar, Gott, & Aarseth 1981). 

The advent of inflation brought for the first time realis- 
tic theoretical power spectra to input into N-body mod- 
els. Together, inflation and Cold Dark Matter specified 
reasonable initial conditions. A suite of such simulations 
by Davis et al. (1985) provided an impressive match to 
many aspects of the observed large-scale structure. 

Just as theory seemed to be converging on the now- 
disproven "standard CDM model," the observations pro- 
vided a shock. De Lapparent, Geller & Huchra (1986) 
found many voids $50\,h^{-1}$ Mpc across. This caused a number 
of people to abandon Gaussian random phase initial 
conditions and gravitational instability — favoring explo- 
sions to produce the voids instead (Ostriker & Cowie 
1981). Then Geller & Huchra (1989) discovered the CfA 
Great Wall of galaxies, which surprised everyone. This 
result was announced at an IAU conference in Rio de 
Janeiro. Many people said that was the end for random 
phase initial conditions, for one expected the covariance 
function to die at a scale of about $30\,h^{-1}$ Mpc and here 
was a structure that was $150\,h^{-1}$ Mpc long. "Perhaps it 
was produced by cosmic string wakes," was one comment 
made at the time (though not by us). 

However, the jump to abandon random phase initial 
conditions ignored the fact that no N-body simulations 
had been done by that time that were large enough to 
properly model structures as large as the CfA Great Wall. 
When Park (1990) did such simulations using 4 million 
particles, simulating such a volume for the first time, 
the results showed that such great walls form routinely. 
In fact, a 20° thick slice survey through the simulations 
was a near perfect visual match to the Geller and Huchra 
survey. These simulations used a standard peak biasing 
scheme and included both standard CDM and $\Omega_m = 0.4$ 
models. Narrow 6° thick slices showed prominent large 
voids like those in the De Lapparent, Geller & Huchra 
slice and great walls appeared in 20° thick slices. This 
simulation showed weak narrow walls and filaments of 
galaxies inside the voids, as seen in the CfA data (Park, 
Gott, Melott, & Karachentsev 1992). 

In similar fashion, N-body dark matter simulations 
large enough to mimic the deep pencil beam surveys 
of Broadhurst, Ellis, Koo, & Szalay (1990) showed ap- 
parently regular spikes (walls) of galaxies just like those 
observed (Park & Gott 1991a). And large N-body sim- 
ulations (Park 1991) showed great attractors just like 
that observed by Lynden-Bell et al. (1988). Great re- 
pulsors are not seen because such peaks in the gravita- 
tional potential occur in the middle of large voids where 
there are too few tracer galaxies. N-body simulations in- 
cluding hydrodynamics have been successful in modeling 
the Lyman Alpha forest (Cen, Miralda-Escude, Ostriker, 
& Rauch 1994; Hernquist, Katz, Weinberg, & Miralda- 
Escude 1996). 

Prior to this study we did an analysis of the topology 
of a large N-body computer simulation made to mimic 
the Sloan Digital Sky Survey (Colley et al. 2000). This 
54 million particle simulation was observed from one lo- 
cation to simulate what will be seen by the Sloan Digital 
Sky Survey, to produce sky maps, slices, and 3D topol- 
ogy maps, to show the power of the survey. The sky map 
looked astonishingly like real sky maps made to similar 
depth, and the slice maps looked quite like similar survey 
maps made in the Las Campanas Survey (Kirshner et al. 
1981) and now seen in the SDSS. The cosmological model 
for this simulation was the flat $\Lambda$CDM model which remains 
in favor. 

Even larger simulations are available today and we are 
interested to see how they fare in their ability to model 
the topology of large-scale structure. The "Millennium 
Run" (MR hereafter), using over 10 billion ($2160^3$) dark 
matter particles (Springel et al. 2005) and surveying a 
cube of side length $500\,h^{-1}$ Mpc, has shown structures remarkably 
like the Great Wall found by Geller & Huchra 
(1989), and even wall complexes somewhat resembling 
the Sloan Great Wall which Gott et al. (2006) measured 
to span 1.37 billion light years (Springel, Frenk & White 
2006). Indeed Figure 1 in Springel et al. (2006) shows a 
remarkable visual agreement between what is seen in the 
Millennium Run and in slices of the CfA, the 2dF sur- 
vey and the SDSS. The most noticeable difference is that 
the Sloan Great Wall looks much more visually promi- 
nent and coherent than the longest chain of walls found 
in the MR. (In this simulation, the box size of $500\,h^{-1}$ 
Mpc cuts off the power spectrum at larger scales. If a 
simulation were to be made with a larger box size, it 
would have more power at these larger scales and therefore 
could more easily produce large coherent structures 
like the Sloan Great Wall.) The MR computes dark matter 
halo formation merger trees and uses a semi-analytic 
model to simulate the galaxy-formation process where 
star formation and feedback are modeled by simple analytic 
physical models. Croton et al. (2006) have produced 
mock galaxy samples of their cube that include 
galaxies brighter than the Magellanic clouds, including 
absolute magnitudes on the SDSS system, which allow 
us to make mock SDSS galaxy samples. 

Park, Kim & Gott (2005) produced 8.6 billion 
particle ($2048^3$) simulations that cover volumes of 
$(1024\,h^{-1}\,\mathrm{Mpc})^3$ and $(5632\,h^{-1}\,\mathrm{Mpc})^3$. These simulations 
employ a Halo Occupation Distribution (HOD) method 
to place an appropriate number of galaxies in heavy ha- 
los identified by the PSB (Kim & Park 2006) and FoF 
techniques. These simulations were used to analyze the 
effects of galaxy formation and bias on topology by Park, 
Kim, & Gott (2005). More recently, Kim & Park have 
produced 1.1 billion particle simulations covering a volume 
of $(614\,h^{-1}\,\mathrm{Mpc})^3$. Here they use a new technique to 
identify physically bound dark matter halos (not tidally 
disrupted by larger structures) at the present epoch and 
identify these with galaxies (we call these the DH simula- 
tions, for Dark (matter) Halos). They too have produced 
magnitudes for these mock galaxies on the SDSS system 
by matching the halo mass function with the luminosity 
function of the SDSS galaxies. 

Cen & Ostriker (2006) have run hydrodynamic simulations 
covering a smaller cube of $120\,h^{-1}$ Mpc on a side 
using an 8.6 billion ($2048^3$) cell hydro mesh, with $1024^3$ 
dark matter particles. Here the galaxy formation process 
is simulated with a hydrodynamic code that identifies 
collapsing regions, calculates star formation rates, and 
includes radiative cooling/heating, UV background ra- 
diation with local attenuation, and supernova feedback 
associated with star formation. Again, some assump- 
tions about star formation are made, but this model has 
one of the most detailed and direct physical calculations 
of the galaxy formation process available for any simula- 
tion that spans a cosmologically-interesting volume. For 
further details of the simulation, we refer the readers to 
Cen, Nagamine, & Ostriker (2005). Nagamine has produced 
the mock catalogs giving absolute magnitudes on 
the SDSS system from the Cen & Ostriker (2006) hydro 
simulations. 

The MR simulation, the Kim & Park DH simula- 
tion and the Cen & Ostriker hydro simulation repre- 
sent state-of-the-art simulations for different schemes to 
mimic galaxy formation. While the astronomical community 
seems to be converging on a standard model for 
cosmology, the flat $\Lambda$CDM model (see, e.g., Riess et al. 
1998, Perlmutter et al. 1999, de Bernardis et al. 2000, and 
Spergel et al. 2006), galaxy formation remains an unsolved 
problem. This means that since only one cosmological 
model need be simulated, larger N-body runs exploring 
different galaxy formation scenarios from different 
teams can be run. Since the parameters in the semi-analytic 
models have been tuned to account for other features 
such as covariance function and multiplicity function, 
and topology was not considered, topology is a particularly 
stringent test. If the models produce the right 
topology automatically, it would constitute dramatic evidence 
that their galaxy formation scenarios were on the 
right track. In any case, a successful model must show 
the universe in all its features, including topology. 

4. SLOAN DIGITAL SKY SURVEY DATA 

The SDSS (York et al. 2000; Stoughton et al. 2002; 
Adelman-McCarthy et al. 2006) is a survey to explore the 
large scale distribution of galaxies and quasars by using 
a dedicated 2.5m telescope (Gunn et al. 2006) at Apache 
Point Observatory. The photometric survey has imaged 
roughly $\pi$ steradians of the Northern Galactic Cap in 
five photometric bandpasses denoted by u, g, r, i, and 
z centered at 3551, 4686, 6165, 7481, and 8931 Å, respectively, 
by an imaging camera with 54 CCDs (Fukugita 
et al. 1996; Gunn et al. 1998). The limiting magni- 
tudes of photometry at a signal-to-noise ratio of 5 : 1 
are 22.0, 22.2, 22.2, 21.3, and 20.5 in the five bandpasses, 
respectively. The median width of the PSF is 1.4", and 
the photometric uncertainties are 2% RMS (Abazajian 
et al. 2004). See Ivezic et al (2004) for details of assess- 
ment of photometric quality and Tucker et al. (2006) for 
discussion of the monitor telescope pipeline employed for 
calibration. 

After image processing (Lupton et al. 2001; Stoughton 
et al. 2002; Pier et al. 2003) and calibration (Hogg et al. 
2001; Smith et al. 2002), targets are selected for spectro- 
scopic follow-up observation. The spectroscopic survey 
is planned to continue through 2008 as the Legacy survey 
and yield about $10^6$ galaxy spectra. The spectra 
are obtained by two dual fiber-fed CCD spectrographs. 
The spectral resolution is $\lambda/\Delta\lambda \sim 1800$, and the RMS 
uncertainty in redshift is $\sim 30$ km/s. Because of the mechanical 
constraint of using fibers, no two fibers can be 
placed closer than 55" on the same tile. Mainly due to 
this fiber collision constraint, incompleteness of the spec- 
troscopy survey reaches about 6% (Blanton et al. 2003a) 
in such a way that regions with high surface densities 
of galaxies become less prominent even after adaptive 
overlapping of multiple tiles. This angular variation of 
sampling density is accounted for in our analysis. 

The SDSS spectroscopy yields three major samples: 
the main galaxy sample (Strauss et al. 2002), the lumi- 
nous red galaxy sample (Eisenstein et al. 2001), and the 
quasar sample (Richards et al. 2002). The main galaxy 
sample is a magnitude-limited sample with apparent Petrosian 
r-magnitude cut of $m_{r,\mathrm{lim}} \approx 17.77$ which is the 
limiting magnitude for spectroscopy. It has a further cut 
in Petrosian half-light surface brightness $\mu_{R50,\mathrm{limit}} = 24.5$ 
mag/arcsec$^2$. More details about the survey can be found 
on the SDSS web site$^5$. 

In our study, we use a subsample of SDSS galaxies 
known as the New York University Value-Added Galaxy 
Catalog (NYU-VAGC; Blanton et al. 2005). This sample 
is a subset of the recent SDSS Data Release 5. One 
of the products of the NYU-VAGC used here is 
Large-Scale Structure sample DR4plus (LSS-DR4plus). We 
use galaxies within the boundaries shown in Figure 1 of 
Park et al. (2006), which improves the volume-to-surface 
area ratio of the survey (important when smoothing). 
There are also three stripes in the Southern Galactic Cap 
observed by SDSS. Density estimation is difficult within 
these narrow stripes, so we do not use them. 

5 http://www.sdss.org/dr5/ 

Fig. 1. 50% high volume contours from three galaxy 
surveys across three decades. From left to right, they are 
Gott, Melott, & Dickinson (1986), Vogeley et al. (1994), and the 
present work. 



The remaining survey region covers 4,471 deg$^2$ (1.362 
steradians). The primary sample of galaxies used here is 
a subset of the LSS-DR4plus sample referred to as void0, 
which is further selected to have apparent magnitudes in 
the range 14.5 < r < 17.6 and redshifts in the range 
0.001 < z < 0.5. These cuts yield a sample of 312,338 
galaxies. The roughly 6% of targeted galaxies which do 
not have a measured redshift due to fiber collisions are 
assigned the redshift of their nearest neighbor. 

Completeness of the SDSS is poor for bright galaxies 
with r < 14.5 because of both the spectroscopic selection 
criteria (which exclude objects with large flux within the 
three arcsecond fiber aperture; the cut at r = 14.5 is an 
empirical approximation of the completeness limit caused 
by that cut) and the difficulty of obtaining correct pho- 
tometry for objects with large angular size. For these 
reasons, analyses of SDSS galaxy samples have typically 
been limited to r > 14.5; using the magnitude limits of 
the void0 sample, the range of absolute magnitude is 
only 3.1 at a given redshift. 

The comoving distance and redshift limits of the 
volume-limited sample we analyze are determined from 
absolute magnitude limits obtained by using the formula 

$$m_{r,\mathrm{lim}} - M_{r,\mathrm{lim}} = 5 \log[r(1+z)] + 25 + \bar{K}(z) + \bar{E}(z), \qquad (9)$$

where $\bar{K}(z)$ is the mean K-correction, $\bar{E}(z)$ is the mean 
luminosity evolution correction, and $r$ is the comoving 
distance corresponding to redshift $z$. We adopt a flat 
$\Lambda$CDM cosmology with density parameters $\Omega_\Lambda = 0.73$ 
and $\Omega_m = 0.27$ to convert redshift to comoving distance. 
To determine sample boundaries we use a polynomial fit 
to the mean K-correction, 

$$\bar{K}(z) = 3.0084\,(z - 0.1)^2 + 1.0543\,(z - 0.1) - 2.5 \log(1 + 0.1). \qquad (10)$$

We apply the mean luminosity evolution correction given 
by Tegmark et al. (2004), $\bar{E}(z) = 1.6\,(z - 0.1)$. The 
rest-frame absolute magnitudes of individual galaxies are 
computed in fixed bandpasses, shifted to $z = 0.1$, using 
Galactic reddening corrections (Schlegel 1998) and K-corrections 
as described by Blanton et al. (2003b). This 
means that a galaxy at $z = 0.1$ has a K-correction of 
$-2.5 \log(1 + 0.1)$, independent of its SED. 

From this sample, we construct a volume-limited sam- 
ple containing galaxies brighter than absolute magnitude 
$M_r = -20.2$ and fainter than $M_r = -21.7$, and spanning 
comoving distance from $171.3\,h^{-1}$ Mpc to $344.5\,h^{-1}$ Mpc 
(corresponding to $z = 0.0578$ to $0.1178$). This observational 
sample is similar to the BEST sample studied in 
Park et al. (2005), but now larger in extent. 
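
The conversion from redshift to comoving distance, and the corrections that enter the magnitude limit, can be sketched as follows. The example assumes a flat cosmology with $\Omega_m = 0.27$ and $\Omega_\Lambda = 0.73$, distances in $h^{-1}$ Mpc (so the absolute magnitudes are strictly $M - 5\log h$, which is an assumption about the convention), and Simpson's rule for the integral. The polynomial and evolution terms are copied from the text as written; the function names are hypothetical, and the sketch is not meant to reproduce the sample's magnitude limits exactly.

```python
import math

C_OVER_H0 = 2997.92458  # speed of light / (100 km/s/Mpc), in h^-1 Mpc
OMEGA_M = 0.27
OMEGA_LAMBDA = 0.73

def comoving_distance(z, steps=2000):
    """Comoving distance in h^-1 Mpc for a flat LCDM model (Simpson's rule)."""
    def inverse_e(x):
        return 1.0 / math.sqrt(OMEGA_M * (1.0 + x) ** 3 + OMEGA_LAMBDA)
    h = z / steps  # steps must be even for Simpson's rule
    total = inverse_e(0.0) + inverse_e(z)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * inverse_e(i * h)
    return C_OVER_H0 * total * h / 3.0

def mean_k_correction(z):
    """Polynomial fit to the mean K-correction, as quoted in the text."""
    return 3.0084 * (z - 0.1) ** 2 + 1.0543 * (z - 0.1) - 2.5 * math.log10(1.1)

def mean_evolution_correction(z):
    """Mean luminosity evolution correction, as quoted in the text."""
    return 1.6 * (z - 0.1)

def absolute_magnitude_limit(m_lim, z):
    """Solve the distance-modulus relation for the absolute magnitude limit."""
    r = comoving_distance(z)
    return (m_lim - 5.0 * math.log10(r * (1.0 + z)) - 25.0
            - mean_k_correction(z) - mean_evolution_correction(z))

for z in (0.0578, 0.1178):
    print(z, round(comoving_distance(z), 1))
```

With these parameters the two redshift limits map to comoving distances close to the quoted boundaries of the volume-limited sample.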

This volume-limited sample contains 70,781 galaxies 
(before overlap correction) and has a mean galaxy separation 
of $6.097\,h^{-1}$ Mpc, so we can safely apply a Gaussian 
smoothing length of $6\,h^{-1}$ Mpc. Numerous numerical 
experiments have shown that if the smoothing length 
is smaller than $1/\sqrt{2} \approx 0.71$ times the mean inter-galaxy 
separation there can be a "meatball shift" due to the al- 
gorithm picking out individual galaxies as isolated high 
density regions (Gott, Weinberg, & Melott 1987, Gott 
et al. 1989). In this sample the smoothing length is ap- 
proximately equal to 0.98 times the mean interparticle 
separation so this shot noise effect should be small. In 
any case, this is not critical for our analysis because we 
compare the observations directly with mock galaxy cat- 
alogs from N-body simulations. These mock catalogs are 
constructed to cover exactly the same range in absolute 
magnitude as seen in the observations and so contain 
very nearly (within a few percent) the same total num- 
ber of galaxies in the sample. Because the techniques 
being applied to the observations and the N-body simu- 
lation mock catalogs are identical, the results should be 
identical (within statistical variation) if the N-body sim- 
ulations are correctly modeling the distribution of galax- 
ies. 
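The shot-noise criterion quoted above amounts to a simple ratio test. The following sketch (helper names are hypothetical) checks a smoothing length against the $1/\sqrt{2}$ threshold and shows how a mean separation follows from a volume and a galaxy count.

```python
import math


def mean_separation(volume, n_galaxies):
    """Mean inter-galaxy separation, (V / N)^(1/3)."""
    return (volume / n_galaxies) ** (1.0 / 3.0)


def smoothing_ratio(smoothing_length, separation):
    """Smoothing length in units of the mean separation."""
    return smoothing_length / separation


def is_shot_noise_safe(smoothing_length, separation):
    """True when the smoothing length is at least 1/sqrt(2) of the mean separation."""
    return smoothing_ratio(smoothing_length, separation) >= 1.0 / math.sqrt(2.0)


if __name__ == "__main__":
    # Values quoted in the text: 6 h^-1 Mpc smoothing, 6.097 h^-1 Mpc separation.
    print(smoothing_ratio(6.0, 6.097), is_shot_noise_safe(6.0, 6.097))
```

With the values quoted for this sample the ratio is about 0.98, comfortably above the threshold.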

5. TOPOLOGY OF LARGE-SCALE STRUCTURE IN THE SDSS

In a previous paper (Park et al. 2005) we analyzed the 
three-dimensional topology of large-scale structure in the 
SDSS at a range of smoothing lengths and compared this
with theoretical expectations. In the present paper we
focus on results with a smoothing length of 6 $h^{-1}$ Mpc,
which yields the most resolution elements and gives the 
most important information on galaxy formation. The
available sample of galaxies has grown significantly
larger, so we can now compare it directly
with large N-body simulations and their
various methods of modeling galaxy formation. As we
shall see, the observational sample is large enough
that the topology, as measured by the genus curve $g(\nu)$,
has become a powerful tool for testing models of galaxy
formation.

Figure 1 shows the progression by date of survey of the
3D topology of selected galaxy redshift surveys. All have
similar smoothing lengths of 5 to 6 $h^{-1}$ Mpc and show
the median density contour surface with the high den- 
sity regions shown as solid and the low density regions 
as empty. According to standard inflationary theory this 
median density contour should be spongelike. The small 
cube on the left shows the 3D region studied by Gott et 
al. (1986). The earth is at the lower front right corner 
of the cube. The topology is spongelike and the Virgo 
cluster is included in the high density region. The larger 
region of isodensity contours in the center of this figure
is from the CfA redshift survey (Vogeley et al. 1994).







Fig. 2. 7% low (blue), 50% (green), and 7% high (red) volume
contours in our SDSS sample. The Sloan Great Wall is visible as 
the long red structure in the lower slice. 



The earth is at the center and the upper fan shaped re- 
gion is in the North Galactic Hemisphere while the lower 
fan shaped region lies within the South Galactic Hemi- 
sphere. The Great Wall noted by Geller & Huchra (1989) 
can be seen connecting high density regions across the 
top fan-shaped region. Again the topology is spongelike, 
with the high density regions all connected together, and 
the low density regions also connected in an interlocking 
pattern. 

Finally, the portion of the SDSS data now available (in 
2006, a full twenty years after the first figure) is shown 
on the right of Figure 1. This is the largest region yet
studied for topology and contains nearly 400,000 galaxies 
in total. The location of the earth is at the back. The 
horizontal slice extending out toward us is the north- 
ern equatorial slice of the SDSS and includes the Sloan 
Great Wall (Gott et al. 2005). The upper slice is a sec- 
ond contiguous thick region in the northern hemisphere 
of the SDSS. (When SDSS-II is complete in 2008, the 
gap between these two slices will be filled in.) It is easy 
to see that the topology of this median density contour 
is spongelike. The high density regions (taking up half 
the volume) form one multiply connected region (shown 
solid) and the low density regions (taking up the other 
half of the volume) also form one multiply connected re- 
gion that is interlocking with the high density region. 

Figure 2 shows the same regions of the SDSS, but with
isodensity contours in different colors for different vol- 
ume fractions. The 7% high density regions — containing 
the highest density 7% of the volume — are solid red. The 
end of the Sloan Great Wall can be seen as the red struc- 
ture snaking from the left to the right in the Equatorial 
slice. This red contour also shows isolated high density 
regions (clusters) as expected from the random phase 
genus curve. The 50% high density contour is shown in 
transparent green — this contour is a multiply connected 
spongelike surface that divides the high density half of 
the sample from the low density half. The 7% low den- 
sity regions are shown as solid blue and show isolated 
voids. The red and blue regions lie on opposite sides of 
the transparent green spongelike surface. 
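The contours above are labeled by volume fraction. A sketch of how such a fraction maps to a threshold is given below, assuming the convention common in genus analyses that $\nu$ labels the fraction $f$ of the volume above the contour through $f = (2\pi)^{-1/2}\int_\nu^\infty e^{-x^2/2}\,dx$. That convention is not restated in this section, so treat it as an assumption; the function names are ours.

```python
import math


def volume_fraction_above(nu):
    """Fraction of volume above threshold nu for the assumed Gaussian labeling."""
    return 0.5 * math.erfc(nu / math.sqrt(2.0))


def nu_for_volume_fraction(f, lo=-8.0, hi=8.0, tol=1e-12):
    """Invert volume_fraction_above by bisection (the function decreases with nu)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if volume_fraction_above(mid) > f:
            lo = mid  # too much volume above: raise the threshold
        else:
            hi = mid
    return 0.5 * (lo + hi)


def density_threshold(densities, f):
    """Density above which a fraction f of equal-volume cells lies."""
    ordered = sorted(densities, reverse=True)
    k = max(1, int(round(f * len(ordered))))
    return ordered[k - 1]
```

Under this assumption the 50% contour corresponds to $\nu = 0$, and the 7% high and 7% low contours to $\nu \approx \pm 1.48$.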
Fig. 3. — Uncrossed (upper) and crossed (lower) images of the 50% density contour for our sample of the SDSS. These images are
displayed with separation equal to the separation between your eyes. To view the upper pair, touch your nose to the page between the 
two views. The pictures will look blurry but each stereo view will be directly in front of the proper eye. Slowly raise your nose from the 
paper keeping the blurry images fused into one blurry stereo view. As you back away to reading distance you will be able to bring these 
two fused views into focus so that you can see the 3D image clearly. (A stereo viewer may also be used.) To see the lower pair, cross your 
eyes by looking at an object (finger or pen tip) held in front of the page, moving it until you see three images, then shift your gaze to the 
central image. 



To facilitate viewing the three-dimensional nature of 
the density contours, Figure 3 shows a stereo pair of the
same SDSS contours. This offers our best picture yet of 
the 3D topology of large scale structure in the universe. 

Figure 4 shows the genus curve of this volume-limited
SDSS sample smoothed at $\lambda = 6\,h^{-1}$ Mpc. In this figure
we compare the observed genus curve with results
for mock surveys produced from the N-body simulations,
which we describe in Section 6 below. Also shown in Figure 4
is the random phase genus curve that best fits the
SDSS genus curve. The data approximately follow this
random phase curve, as expected from inflation. However,
there are measurable departures that are likely to
have been caused by non-linear effects and galaxy formation.
We characterize these differences from the random
phase curve using the measures $A$, $\Delta\nu$, $A_V$, and $A_C$
described above and plot these values in Figures 5 and 6.
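For orientation, the random phase genus curve has the form $g(\nu) = A(1 - \nu^2)e^{-\nu^2/2}$. The sketch below evaluates the three shape statistics under the definitions we assume here, following the convention of Park et al. (2005): $\Delta\nu$ is the $\nu$-weighted integral of the observed curve over $-1 < \nu < 1$ divided by the integral of the fitted curve over the same range, and $A_V$ and $A_C$ are ratios of observed to fitted integrals over $-2.2 < \nu < -1.2$ and $1.2 < \nu < 2.2$. The integration ranges and function names are assumptions, since the definitions are not repeated in this section.

```python
import math


def random_phase_genus(nu, amplitude=1.0):
    """Gaussian random phase genus curve, A (1 - nu^2) exp(-nu^2 / 2)."""
    return amplitude * (1.0 - nu * nu) * math.exp(-0.5 * nu * nu)


def integrate(func, a, b, n=2000):
    """Composite trapezoid rule on [a, b]."""
    h = (b - a) / n
    total = 0.5 * (func(a) + func(b))
    for i in range(1, n):
        total += func(a + i * h)
    return total * h


def genus_statistics(observed, amplitude):
    """Return (shift, A_V, A_C) of an observed curve relative to the fitted curve."""
    def fit(nu):
        return random_phase_genus(nu, amplitude)

    shift = (integrate(lambda nu: nu * observed(nu), -1.0, 1.0)
             / integrate(fit, -1.0, 1.0))
    a_v = integrate(observed, -2.2, -1.2) / integrate(fit, -2.2, -1.2)
    a_c = integrate(observed, 1.2, 2.2) / integrate(fit, 1.2, 2.2)
    return shift, a_v, a_c
```

A curve identical to the fit gives $\Delta\nu = 0$ and $A_V = A_C = 1$, while a fitted curve displaced to the left by 0.08 in $\nu$ gives $\Delta\nu \approx -0.08$, which is the sense of the meatball shift discussed in the text.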

Figure 5 shows the ($\Delta\nu$, $A$) plane with the SDSS values
of ($\Delta\nu$, $A$) plotted as a solid blue square, while the black
circle indicates the random phase prediction of $\Delta\nu = 0$
(the amplitude of this point is not that of a random phase
distribution, but rather is fixed to be the same as the
SDSS data).

Fig. 4. — Genus curves for the SDSS sample, hydro sample, and
random samples drawn from the 100 DH and 50 MR studies. The
Gaussian random curve is shown for comparison. Notice that we
plot $G(\nu)$ (i.e., do not divide by the sample volume) in order to
show the genus of the entire sample at each $\nu$.

The two blue X's show the values computed from the
two SDSS slices separately, as if each were the only sample
studied. In this case, the median density contour is
calculated for each sub-sample separately. Each thus has
a somewhat different density level at the median contour,
because the presence of the Sloan Great Wall in the equatorial
sample guarantees that the median density in this
sample is greater than in the northern sample. The difference
between the genus parameters of these two sub-
samples provides a rough measure of the cosmic variance. 
The values of the genus amplitude, A, for the two sub- 
samples are of course smaller, which we correct for by 
dividing each by the respective fraction of the entire vol- 
ume it represents. 

The observed genus curve in Figure 4 is displaced
slightly to the left in the central regions (a slight meatball
shift, or a prominence of clusters over voids). This gives
the curve a negative value of $\Delta\nu$, as indicated by its
defining equation. The whole sample has a value of $\Delta\nu = -0.08$.
The SDSS Great Wall itself is a very prominent simply-
connected high density region and so by itself can cause a
meatball shift. Indeed, the equatorial sample, which contains
the Sloan Great Wall, has a value of $\Delta\nu = -0.11$.
However, the northern sample, which does not contain the
Sloan Great Wall, has by itself a value of $\Delta\nu = -0.08$.
Thus, the meatball shift in the data is a general phenomenon
and is not due solely to the Sloan Great Wall.

Such a meatball shift has been seen in observational 
samples before. It was first noticed and commented on 
by Gott et al. (1989) who examined the CfA, Giovanelli 
& Haynes, and Tully samples. Gott, Cen, and Ostriker 
(1996) found that hydrodynamic simulations predicted 
a meatball shift for (early type) elliptical galaxies rela- 
tive to (late type) spiral galaxies (elliptical galaxies tend 
to congregate more in isolated rich clusters), an effect 



Fig. 5. — Plot of the genus shift parameter $\Delta\nu$ versus the genus-
curve amplitude $A$ for our samples. The black circle corresponds
to Gaussian random phase. The blue square is the SDSS sample, 
with the two blue Xs representing each of the two SDSS subregions 
(normalized in amplitude to the volume of the whole sample). The 
green hexagon is the mean of the 100 DH mock Sloans, with er- 
ror bars representing the standard deviation of that sample. The 
red pentagon is the mean of the 50 MR mock Sloans, with error 
bars showing that standard deviation. The pink X denotes the 
hydrodynamic simulation of Cen & Ostriker; the error bars correspond
to the cosmic variance of (120 $h^{-1}$ Mpc)$^3$ subregions within
a (500 $h^{-1}$ Mpc)$^3$ box.



later observed in the 2D topology analysis of the SDSS 
by Hoyle et al. (2002) where a meatball shift in red (early
type) galaxies was seen relative to blue (late type) galax- 
ies. Thus, it is clear that galaxy formation processes can 
produce meatball shifts relative to the random phase 
curve, as we observe with high statistical significance. 
The question is whether our galaxy formation models ac- 
curately reproduce this effect. Below we discuss whether 
they successfully model this shift in the genus curve. 

Figure 6 shows the ($A_V$, $A_C$) plane with the SDSS data
again shown as a solid blue square, with the random
phase prediction shown as a black circle. Again, X's
indicate measurements for the two regions of the SDSS
considered separately. The data have values of ($A_V$, $A_C$)
that depart from the random phase values of (1, 1) due
to biased galaxy formation and non-linear effects.

6. TOPOLOGY OF SDSS DATA VS. SIMULATIONS 

The genus curve for the SDSS sample shows a clear 
shift towards a "meatball" topology and behavior in the
void and cluster-dominated tails that indicates fewer isolated
voids and clusters than expected from either a
Gaussian random phase distribution or from perturba- 
tion theory (see Matsubara 1994 and Park et al. 2005). 
In this section we examine whether current simulations 
of large-scale structure reproduce these features and use 
the simulations to assess the statistical significance of 
these departures from random phase behavior. 

TABLE 1
Genus statistics for SDSS and simulations thereof

Name    Amplitude                 $\Delta\nu$                $A_V$                      $A_C$
SDSS    190.46                    -0.080                     0.747                      0.804
DH      190.79 ± 11.65            0.022 ± 0.028              0.806 ± 0.052              0.811 ± 0.067
MR      175.42 ± 9.69 (±9.86)     0.010 ± 0.023 (±0.036)     0.845 ± 0.057 (±0.081)     0.862 ± 0.063 (±0.097)
Hydro   146.23 [±31.9] (±31.9)    -0.008 [±0.106] (±0.107)   0.783 [±0.328] (±0.328)    1.016 [±0.218] (±0.219)

Note. — Errors not in parentheses are the standard deviations of the population in question (not given
for the real SDSS or the Hydro simulation, where there is only one sample); errors in parentheses additionally
include the effect of cosmic variation within a 1024 $h^{-1}$ Mpc box, and the bracketed errors for Hydro represent
cosmic variation for its box size within a 500 $h^{-1}$ Mpc box.

Fig. 6. — The same as Fig. 5, plotting $A_V$ and $A_C$. The upper-left
blue X corresponds to the SDSS region without the Sloan Great
Wall.

Our approach is to construct mock SDSS redshift samples
from each of the simulations that mimic the observational
selection effects caused by the survey geometry,
sampling density of structure, and redshift-space distor- 
tions. In the DH and MR cases, we construct many such 
mock surveys, smooth and compute the genus curves, 
and compute the variables $A$, $\Delta\nu$, $A_V$, and $A_C$ for each.
Then we compute the mean and standard deviations of
those statistics, as plotted in Figures 5 and 6.

In considering the predictions of the various N-body 
simulations it is important to estimate the cosmic variance
one may encounter. We start with the DH simulation.
This simulation has a volume of (614 $h^{-1}$ Mpc)$^3$,
which is a bit over 16 times the volume of the SDSS
sample, allowing roughly that many independent mock 
surveys. We create 100 such surveys of the SDSS in or- 
der to fully sample the structure in the cube with the 
irregular shape of the SDSS; they are clearly not inde- 
pendent, but the mean and distribution function of the 
genus statistics are not affected by this. We show the
mean and standard deviations of the genus quantities
for the Park et al. simulations in Figures 5 and 6 with
a green hexagon and associated uncertainty limits. It is
clear that the SDSS is more than 3$\sigma$ away from the mean
value of $\Delta\nu$. In fact none of the 100 mock surveys has
a value of $\Delta\nu$ as negative as the observations. It could
be argued that the SDSS contains the SDSS Great Wall, 
which is so unusual that this region should be excluded 
from the analysis. However, this is exactly the purpose of 
the 100 mock surveys: to examine the range of values we 
expect in SDSS-sized surveys. So the fact that the SDSS 
is outside the 2 sigma error bars from the cosmic variance 
expected shows that the DH simulations are not successful
in predicting the observed meatball shift. Also, recall
that the values of $\Delta\nu$ for both SDSS sub-samples (one
not containing the Sloan Great Wall) are outside these 
limits. 
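The size of the discrepancy in $\Delta\nu$ can be read directly off Table 1. The following sketch expresses the SDSS value as an offset in units of the mock-survey standard deviation; the helper and dictionary names are ours.

```python
def sigma_offset(observed, mock_mean, mock_std):
    """Offset of an observed value from the mock mean, in standard deviations."""
    return (observed - mock_mean) / mock_std


# Values of the genus shift parameter taken from Table 1.
SDSS_SHIFT = -0.080
MOCK_SHIFT = {
    "DH": (0.022, 0.028),
    "MR": (0.010, 0.023),
    "MR incl. box-scale cosmic variance": (0.010, 0.036),
}

if __name__ == "__main__":
    for name, (mean, std) in MOCK_SHIFT.items():
        print(name, sigma_offset(SDSS_SHIFT, mean, std))
```

With the DH population scatter the SDSS value lies about 3.6 standard deviations below the mock mean, and with the MR error that includes box-scale cosmic variance it lies 2.5 standard deviations below.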

The DH simulations match the data almost exactly in amplitude
(190.79 versus 190.46; Table 1). Importantly, the DH simulations (as well as the
Millennium Run and the Cen and Ostriker simulations)
include the effects of baryon oscillations, which give the
correct initial power spectrum. A similar DH simulation
with 8.6 billion particles but without baryon oscillations
had an amplitude of $209 \pm 11$, which is 1.6$\sigma$ high relative
to the data (191). This shows that the topology can detect
the presence of the baryon oscillations by measuring
the slope of the power spectrum: simulations without
them produce an amplitude of the genus curve that is
too high. The values of $\Delta\nu = +0.020 \pm 0.027$ for those
simulations without the baryon oscillations were not
appreciably different from the ones that include the baryon
oscillations.
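The quoted significance of the amplitude excess follows from the rounded values given in the text (the subscript labels are ours):

```latex
\frac{A_{\mathrm{no\ oscillations}} - A_{\mathrm{SDSS}}}{\sigma_A}
  = \frac{209 - 191}{11} \approx 1.6
```

Using the unrounded SDSS amplitude of 190.46 from Table 1 instead gives a ratio of about 1.7.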

For the MR simulation, we have a volume of (500 $h^{-1}$
Mpc)$^3$, which enables us to make roughly 10 independent
mock surveys of the SDSS. Again, to fully sample
the cube, we make 50 mock surveys at random positions
and orientations. The mean and standard deviation of
these mock surveys are shown in Figures 5 and 6 as a
red pentagon with associated error bars. This indicates
the cosmic variance seen from SDSS-sized mock surveys
drawn from the (500 $h^{-1}$ Mpc)$^3$ survey region.

However, we must additionally consider the effect of 
cosmic variance on the scale of the simulation region: 
error bars on simulations with smaller box sizes will be 
somewhat underestimated with respect to larger ones (e.g.,
the MR with respect to the DH), and with respect to the real universe
(which of course has infinite box size). To estimate
this added variance, we take 8 sub-cubes of volume
(512 $h^{-1}$ Mpc)$^3$ out of the larger DH simulation of Kim
& Park (2007, in prep.) mentioned above, of volume
(1024 $h^{-1}$ Mpc)$^3$, and make redshift maps of them; i.e.,
we give the galaxies their correct $x$ and $y$ coordinates,
but put them at a $z$ coordinate equal to $z + v_z H_0^{-1}$, as if
we were viewing a redshift space map of the survey region
from a great distance. (Note that even with the different
power spectrum, the variance of the genus statistics
should still be correct.) We then compute the 3D topology
for each of the 8 sub-cubes and measure the standard
deviation of the 8 mean values of the parameters ($\Delta\nu$, $A$)
and ($A_V$, $A_C$). This cosmic variance of the 8 sub-cubes
is added to the variance of the MR sample population
(i.e., the standard deviations are added in quadrature) 
to produce an effective standard deviation we expect to 
find due to cosmic variance for SDSS-sized surveys in the 
MR if we had a larger (1024 $h^{-1}$ Mpc)$^3$ simulation from
which to draw mock catalogs. The increase in the standard
deviation is only a few percent for the amplitude $A$
but $\sim 50\%$ for the other three genus statistics (see Table 1
for both sets of values). We would have liked to
make a similar set of sub-cubes for the (614 $h^{-1}$ Mpc)$^3$
DH simulation, but of course one cannot draw a statistically
significant number of samples of that size from a
(1024 $h^{-1}$ Mpc)$^3$ box without significant overlap.
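The quadrature sum described above, and the size of the resulting increase, can be checked against the MR entries in Table 1. The sketch below is illustrative only; the helper names are ours, and the box-scale term is inferred from the two tabulated errors rather than taken from the sub-cube measurements themselves.

```python
import math


def add_in_quadrature(sigma_population, sigma_box):
    """Effective standard deviation from two independent contributions."""
    return math.hypot(sigma_population, sigma_box)


def implied_box_sigma(sigma_population, sigma_total):
    """Box-scale contribution implied by a population error and a total error."""
    return math.sqrt(sigma_total ** 2 - sigma_population ** 2)


def percent_increase(sigma_population, sigma_total):
    """Percentage growth of the error bar once the box-scale term is included."""
    return 100.0 * (sigma_total / sigma_population - 1.0)


# MR rows of Table 1: (population error, error including box-scale variance).
MR_ERRORS = {
    "A": (9.69, 9.86),
    "delta_nu": (0.023, 0.036),
    "A_V": (0.057, 0.081),
    "A_C": (0.063, 0.097),
}

if __name__ == "__main__":
    for name, (pop, total) in MR_ERRORS.items():
        print(name, implied_box_sigma(pop, total), percent_increase(pop, total))
```

The tabulated errors imply an increase of under 2% for the amplitude and of roughly 40% to 60% for the other three statistics, consistent with the statement in the text.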

The fact that the MR has a smaller volume also pro- 
duces a systematic effect in that it has less power in 
its power spectrum at large scales (no power on scales beyond
500 $h^{-1}$ Mpc), so we should expect its amplitude $A$
to be systematically a little large. Actually, the amplitude
is lower than for the DH simulations and is about
1$\sigma$ low relative to the data, so apparently the effects of
cosmic variance in a sample this size are more important 
than the systematic effect caused by the lack of long- 
wavelength modes. 

For the Cen and Ostriker hydro simulation we have 
only one (120 $h^{-1}$ Mpc)$^3$ simulation volume. This is
about one-eighth the volume of the SDSS region so using 
its periodic boundary conditions we simply replicate it to 
make a volume large enough to encompass the SDSS. We 
then make a mock redshift catalog with the same abso- 
lute magnitude limits. Because the periodicity kills the 
large scale power we expect the simulation to be chop- 
pier than the real universe and therefore have a higher 
amplitude A than the observations. But even more im- 
portantly, we expect a large cosmic variance in a sample 
only as small as (120 $h^{-1}$ Mpc)$^3$. To model this we construct
64 sub cubes, each of volume (120 $h^{-1}$ Mpc)$^3$, from
the MR simulation (of volume (500 $h^{-1}$ Mpc)$^3$), and as
above we make redshift maps of them (giving galaxies
their correct $x$ and $y$ coordinates but putting them at a $z$
coordinate equal to $z + v_z H_0^{-1}$). Then we compute the
3D topology for each of the 64 sub cubes and measure
the standard deviation of the 64 values of the parameters
($\Delta\nu$, $A$) and ($A_V$, $A_C$). This cosmic variance from the 64
sub cubes produces the error bars surrounding the one 
value from the Cen et al. simulation (shown as a magenta 
X with error bars.) We may then add the extra cosmic 
variance as per the MR data point to produce the standard
deviations corresponding to a volume of (1024 $h^{-1}$
Mpc)$^3$ (again given in Table 1; the extra standard deviation is < 1%
for all statistics). These error bars are very large; this
one simulation is not very constraining of the topologi- 
cal properties. Roughly speaking, the volume of the Cen 
and Ostriker simulation is an eighth that of the SDSS so 
we expect error bars that are roughly $\sqrt{8} \approx 2.8$ times as
large as for the cosmic variance seen in the SDSS. This
ratio is approximately correct as Figures 5 and 6 show.
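The sub-cube bookkeeping and the plane-parallel redshift map used for these variance estimates can be sketched as follows. This is a minimal illustration under stated assumptions: velocities are in km/s, distances are in $h^{-1}$ Mpc (so that $H_0 = 100$ in these units), and the function names are ours.

```python
import math

# Assumed units: km/s per h^-1 Mpc, so that v_z / H0 is a distance in h^-1 Mpc.
H0 = 100.0


def count_subcubes(box_size, sub_size):
    """Number of non-overlapping sub cubes of side sub_size in a cubic box."""
    per_side = int(box_size // sub_size)
    return per_side ** 3


def to_redshift_space(positions, velocities, hubble=H0):
    """Keep x and y, and move each object to z + v_z / H0 (distant-observer map)."""
    return [(x, y, z + vz / hubble)
            for (x, y, z), (_, _, vz) in zip(positions, velocities)]


def error_scaling(volume_ratio):
    """Expected growth of the error bars when the volume shrinks by volume_ratio."""
    return math.sqrt(volume_ratio)


if __name__ == "__main__":
    print(count_subcubes(500.0, 120.0), count_subcubes(1024.0, 512.0))
    print(error_scaling(8.0))
```

A 500 $h^{-1}$ Mpc box holds $4^3 = 64$ sub cubes of side 120 $h^{-1}$ Mpc, a 1024 $h^{-1}$ Mpc box holds 8 of side 512 $h^{-1}$ Mpc, and a volume smaller by a factor of 8 is expected to give error bars larger by about 2.8.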

In Figure 4 we show the genus curve for the Cen and
Ostriker simulation. It is not a good fit to the SDSS
observations. The top of the genus curve is chopped off, and
it has a meatball shift in the void region that is not seen
in the observations. In the central region $-1 < \nu < 1$,
where $\Delta\nu$ is measured, the curve is too fat, but also has a
small negative value of $\Delta\nu = -0.01$. This is more negative
than either the MR or the DH simulations, but is still
not very close to the observed value of $\Delta\nu = -0.08$. The
amplitude of the Cen and Ostriker simulation is much
lower than the other three data sets. Overall the Cen
et al. simulation does not give a good fit to the data;
however, the volume of the simulation is small and the
large error bars show that the observations are within
2$\sigma$ in both $\Delta\nu$ and $A$. (To be fair, other hydro
simulations by Cen and his collaborators — for example, 
Gott, Cen, & Ostriker 1996 — have produced good look- 
ing genus curves, so this one may be suffering from a bit 
of bad luck.) 

So, currently, the Cen and Ostriker hydro simulations 
are too small. They need to be increased in volume by 
about a factor of 10 to be fairly tested against the SDSS. 
In 1975 the largest N-body cosmological simulation had 
4,000 particles (Aarseth, Gott, & Turner 1975). In 1990 
the largest N-body cosmological simulation had 4 mil- 
lion particles (Park 1990). This led Gott to predict in 
1990 that by 2005 the record would be 4 billion parti- 
cles. Springel et al. (2005) did even better than this with 
over 10 billion particles. An increase of a factor of 10 
every 5 years is just what would be predicted by Moore's 
law (a doubling every 18 months). So given Moore's law 
we should expect that Cen and his colleagues will have 
a hydro code simulation with volume 10 times larger by 
2011. Then we can see if it outperforms the MR or the 
DH simulations. Just one additional hydro run with the 
same parameters would also be interesting — would it be 
closer to the observations (suggesting some bad luck in 
the current run) or further away? With the current error
bars, the MR, DH, and Cen and Ostriker
simulations are all consistent with each other within 2$\sigma$.
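The growth-rate arithmetic in this paragraph is easy to verify. The sketch below compares a doubling every 18 months with a factor of 10 every 5 years and extrapolates the particle counts quoted in the text; the function names are ours.

```python
def growth_factor(years, doubling_time_years=1.5):
    """Growth over a span of years for a fixed doubling time."""
    return 2.0 ** (years / doubling_time_years)


def extrapolate(count, start_year, end_year, factor_per_5yr=10.0):
    """Extrapolate a count assuming a fixed growth factor every five years."""
    return count * factor_per_5yr ** ((end_year - start_year) / 5.0)


if __name__ == "__main__":
    print(growth_factor(5.0))
    print(extrapolate(4.0e3, 1975, 1990), extrapolate(4.0e6, 1990, 2005))
```

A doubling every 18 months gives a factor of about 10.1 in 5 years, and a factor of 10 every 5 years carries 4,000 particles in 1975 to 4 million in 1990 and 4 billion in 2005, matching the figures in the text.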

In Figure 6 we compare the results for $A_V$ and $A_C$.
Here the observations and the Millennium Run and the
DH simulations are in better agreement. The observations
have values of ($A_V$, $A_C$) = (0.75, 0.80) which depart
from those of the random phase genus curve (1, 1). As we
have mentioned, values of $A_V < 1$ on these scales are not
produced by non-linear gravitational clustering but must
be produced by biased galaxy formation (Park, Kim &
Gott 2005). Here the Millennium Run and the DH simulations
are both moved from the random phase value
in the direction of the observations. The DH simulation
does better, being nearly perfect in $A_C$ and just over one
sigma away in $A_V$. The Millennium Run is within one
sigma in $A_C$ and around 1.7$\sigma$ away in $A_V$ (neglecting cosmic
variance). Again, the Cen and Ostriker simulation
is further away in $A_C$, but its larger error bars leave it
within 1 sigma in both $A_V$ and $A_C$.

A possible issue with all these simulations is that their
initial conditions assume a value of $\sigma_8 = 0.9$ at the
present epoch to be consistent with the fit for the WMAP
first-year data (Spergel et al. 2003), but the WMAP
three-year data prefer a lower value of $\sigma_8 = 0.76$ (Spergel
et al. 2006). This implies less non-linear growth of clustering
up to the present epoch, and more biasing. We
can estimate the possible effect of using a smaller value
of $\sigma_8$ by examining the genus statistics at a slightly earlier
epoch in a simulation designed to reach linear $\sigma_8 = 0.9$ at
$z = 0$. For example, at $z = 0.5$ (when $\sigma_8 = 0.76$) in the
DH simulations, we find that dark matter halos with the
same density as the $z = 0$ halos have $\Delta\nu = 0.015$, versus
$\Delta\nu = 0.02$. Compared to the discrepancy in $\Delta\nu$ between
the SDSS data and simulations, this is a negligible
effect and we conclude that this does not appear to be a
problem. We reach similar conclusions about the effect
of $\sigma_8$ on $A_V$ and $A_C$. The small size of these effects over
this range of redshift (and so, over this same range of $\sigma_8$)
can also be seen in Figure 4 of Park et al. (2005).

Another small but interesting effect has to do with 
halo or galaxy identification: nearby halos (or galaxies in 
the hydro simulation) may get merged together and only 
counted as one point. To estimate this effect, we calculated
the genus of SDSS after merging together all galaxies
closer than 100 kpc (an overestimate of the actual scale
of the problem). This does seem to move SDSS slightly
closer to the simulations: the amplitude decreased by
$\sim 0.5\%$, $\Delta\nu$ increased from $-0.080$ to $-0.078$, and $A_V$
increased by $\sim 1\%$, while $A_C$ decreased by about the
same amount.
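The merging procedure is not spelled out in the text. One simple possibility, shown here purely as an assumption, is to link all pairs closer than the chosen scale, group them transitively, and replace each group by its centroid. The function name and the brute-force pair search are ours; a real catalog of this size would need a spatial index.

```python
import math


def merge_close_points(points, linking_length):
    """Group points closer than linking_length (transitively); return group centroids."""
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Brute-force pair search: O(N^2), adequate only for small illustrative inputs.
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(points[i], points[j]) < linking_length:
                ri, rj = find(i), find(j)
                if ri != rj:
                    parent[rj] = ri

    groups = {}
    for i, p in enumerate(points):
        groups.setdefault(find(i), []).append(p)
    return [tuple(sum(c) / len(g) for c in zip(*g)) for g in groups.values()]


if __name__ == "__main__":
    # Coordinates in Mpc; 0.1 Mpc corresponds to the 100 kpc scale in the text.
    sample = [(0.0, 0.0, 0.0), (0.05, 0.0, 0.0), (10.0, 0.0, 0.0)]
    print(merge_close_points(sample, 0.1))
```

In the small example, the two points separated by 0.05 Mpc collapse into a single point at their centroid while the distant point is left alone.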

Figure 4 shows the genus curve of the observations versus
one of the 50 MR mock surveys picked at random,
one of the 100 DH mock surveys picked at random, and
the hydro mock survey. The DH simulation looks the
best. The top of the genus curve near $\nu = 0$ is not cut
off and it looks most like the observations. Both the Millennium
Run mock and the hydro mock run are cut off at
the top in the same way, the hydro one more so. To get
a better picture of the cosmic variance, Figure 7 shows
the observations compared to hatched bands showing the
1$\sigma$ variation in the mock runs from the (a) DH and (b)
MR simulations. It is clear from this that the one random
Millennium Run mock survey shown in Figure 4 was
worse than average at fitting the top of the curve, but it
is also clear that the DH simulation is still better than
the Millennium Run at fitting the observations because
the band of Millennium Run simulations is lower in this
region. The place where the Millennium Run mocks and
the DH mocks fail the worst is in the region $0.4 < \nu < 1.2$,
where the observations are consistently shifted to the left
with respect to the simulations.

We conclude that the simulations do an adequate job of
representing the topology in all the variables except $\Delta\nu$,
which for the SDSS data lies more than 2.5$\sigma$ away from
either the MR or DH simulations. Of the 100 DH mock
surveys, the one closest to the observations in terms of
the four variables and their error bars is mock catalog 94,
which has $\Delta\nu = -0.02$, $A = 191$, $A_V = 0.80$, $A_C = 0.77$.
These values are acceptable in all except $\Delta\nu$, where it is still far from the
observational value. In fact, of the 100 mock DH simulations
the two most negative in $\Delta\nu$ are between $-0.05$ and
$-0.04$ while the observations show $-0.08$. The four most
negative of the 50 mock Millennium Run simulations are
between $-0.03$ and $-0.02$. Interestingly, the higher spatial
resolution and semi-analytical modeling of the MR
do not yield a better fit to the observed topology of
large-scale structure than the dark-matter, halo-finding
DH simulations.



7. CONCLUSIONS 

The SDSS dataset has now become large enough that 
the topology of large-scale structure can be used for more 
detailed model testing. We find that the SDSS observa- 
tions have a sponge-like median contour and follow fairly 
closely the genus curve expected from Gaussian random 
phase initial conditions predicted by inflation. We quantify
departures from this theoretical curve that provide
key tests of models for galaxy formation, as repre-
sented by the several simulations that we examine. 

The amplitude of the genus curve is in agreement
with that predicted from the standard ΛCDM model
(with the WMAP parameters) with baryon oscillations
included. If baryon oscillations were not included the
fit to the amplitude would be significantly worse (1.6$\sigma$).
The observed values of A v and A c are predicted well by 
both the MR and DH simulations. Both show the effects 
of non-linear gravitational evolution and biased galaxy 
formation. The Cen and Ostriker hydro simulations are 
consistent with the data, but their small volume gives 
them large error bars and they are currently not giving 
values closer to the observations than the MR or DH 
simulations. 

The most notable feature of the observations is a meatball
shift $\Delta\nu = -0.08$ showing a slight prominence of isolated
high density regions over isolated voids. The SDSS
Great Wall is one large connected isolated high density
region and contributes to this effect, but the effect also
shows up in the northern part of the SDSS which does
not contain the Sloan Great Wall. If the Sloan Great
Wall were entirely responsible for this result one might
argue that it was produced by rare objects whose frequency
of occurrence was determined by the power in
the initial conditions at very large scales and that even
larger simulations $> (1024\,h^{-1}$ Mpc$)^3$ would be needed to
properly test for this effect. But this is not the case.
A negative value of $\Delta\nu = -0.08$ shows up even in the part
of the survey that does not include the SDSS Great Wall.
Also, slice surveys of the MR simulation show great walls
that look quite impressive, if not quite as dramatic as
the Sloan Great Wall. The observed $\Delta\nu$ values are more
than 2.5$\sigma$ away from the values found in the MR and
the Dark Matter Halo simulations. This is a severe test
for large N-body simulations and their heuristic galaxy
formation scenarios because these were not tuned to account
for topology.

The slight meatball shift seen in the observations has 
been noticed in previous observational samples with similar
smoothing lengths (6 $h^{-1}$ Mpc), being mentioned first
by Gott et al. (1989). The large survey by Canavezes et
al. (1998) of the IRAS galaxies looks Gaussian random 
phase on all larger smoothing lengths, but does have a 
slight meatball shift with a smoothing length of 6 $h^{-1}$
Mpc. Hydrodynamic simulations suggest that early type 
galaxies should show more of a meatball shift than late 
type galaxies (Gott, Cen, & Ostriker 1996) and this effect 
has already been observed in the equatorial slice of the 
SDSS in a 2D topology survey by comparing the relative 
meatball shift between red and blue galaxies (Hoyle et 
al. 2002). 

(a) DH

Fig. 7. — Genus curves with shaded 1$\sigma$ error regions for the (a)
random phase.

As the Sloan Digital Sky Survey is completed, the gap
between the northern and equatorial slices will be closed,
giving us one large continuous volume, where the fraction
of the sample one throws away because of closeness to the
edge will be diminished. This will approximately double
the effective volume of the sample and give us a still better
test. Also, studies with smoothing lengths of 10 $h^{-1}$
Mpc and 20 $h^{-1}$ Mpc will be possible with high precision
allowing more direct tests of the Gaussian random phase 
hypothesis on scales where the galaxy formation effects 
are less important. 

It would be interesting to see N-body simulations cov- 
ering larger volumes which would have more power at 
large scales (because they would not be artificially cut 
off at the box size). This would make for more accurate 
modeling of the structure and frequency of occurrence of
structures like the Sloan Great Wall (Gott et al. 2005).

The results here suggest that in order to account for the 
observed topology some changes in galaxy formation sce- 
narios are called for. We look forward to improvements 
in the N-body simulations. Of particular interest is how 
well larger hydrodynamic simulations will perform when 
compared with larger samples, and whether there will be 
a convergence of predictions as hydrodynamic,
merger tree, and dynamical halo occupation methods are
improved. Galaxy formation is not yet a solved problem
in cosmology and the 3D topology offers a strong test of
models which is independent of other measures.